438 research outputs found

    Scientific workflows for bibliometrics

    Scientific workflows organize the assembly of specialized software into an overall data flow and are particularly well suited for multi-step analyses using different types of software tools. They are also favorable in terms of reusability, as previously designed workflows can be made publicly available through the myExperiment community and then used in other workflows. Here we illustrate how scientific workflows, and the Taverna workbench in particular, can be used in bibliometrics. We discuss the specific capabilities of Taverna that make this software a powerful tool in this field, such as automated data import via Web services, data extraction from XML by XPaths, and statistical analysis and visualization with R. Support for R is particularly relevant, as it allows integration of a number of recently developed R packages specifically for bibliometrics. Examples are used to illustrate the possibilities of Taverna in the fields of bibliometrics and scientometrics.
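The XPath extraction step mentioned in the abstract above can be illustrated outside Taverna. The following is only a minimal sketch in Python, not the authors' workflow: the XML snippet, its element names (`records`, `record`, `year`) and the function `publications_per_year` are all hypothetical stand-ins for a real Web-service response.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical XML standing in for a Web-service response; the element
# names are invented for this illustration only.
SAMPLE_XML = """
<records>
  <record><title>Paper A</title><year>2011</year></record>
  <record><title>Paper B</title><year>2012</year></record>
  <record><title>Paper C</title><year>2012</year></record>
</records>
"""

def publications_per_year(xml_text):
    """Select every <year> element with an XPath-style query and count them."""
    root = ET.fromstring(xml_text.strip())
    # ElementTree supports a limited XPath subset; this path is within it.
    years = [node.text for node in root.findall("./record/year")]
    return Counter(years)

counts = publications_per_year(SAMPLE_XML)
print(sorted(counts.items()))
```

In a workflow setting, the counts would then be passed on to a statistics or plotting step, which the abstract assigns to R.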

    Protein expression dynamics during Escherichia coli glucose-lactose diauxie

    BACKGROUND: Escherichia coli is a well-studied, facultatively anaerobic bacterium that is able to regulate its metabolic pathways depending on the type of sugar present in the medium. We have studied the glucose-lactose shift in E. coli at the protein level using a recently developed mass spectrometry platform. METHOD: Cells were grown in minimal medium containing two sugars (glucose and lactose) and analyzed using a novel mass spectrometry cluster. The cluster combines the high resolving power and dynamic range of Fourier transform ion cyclotron resonance (FTICR) for accurate mass measurement and quantitation with multiple ion traps for fast and sensitive tandem mass spectrometry. The protein expression profile was followed in time across the glucose-lactose diauxic shift using label-free quantitation from the FTICR data. RESULTS AND CONCLUSION: The entire dataset was interrogated by KEGG pathway analysis, mapping measured changes in protein abundance onto known metabolic pathways. The results were consistent with previously published gene expression data, with β-galactosidase being the most strongly induced protein during the diauxic shift.

    Theoretical considerations for next-generation proteomics

    While mass spectrometry still dominates proteomics research, alternative and potentially disruptive next-generation technologies are receiving increased investment and attention. Most of these technologies aim at the sequencing of single peptide or protein molecules, typically labeling or otherwise distinguishing a subset of the proteinogenic amino acids. This note considers some theoretical aspects of these future technologies from a bottom-up proteomics viewpoint, including the ability to uniquely identify human proteins as a function of which and how many amino acids can be read, enzymatic efficiency, and the maximum read length. This is done through simulations under ideal and non-ideal conditions to set benchmarks for what may be achievable with future single-molecule sequencing technology. The simulations reveal, among other observations, that the best choice of reading N amino acids performs similarly to the average choice of N+1 amino acids, and that the discrimination power of the amino acids scales with their frequency in the proteome. The simulations are agnostic with respect to the next-generation proteomics platform, and the results and conclusions should therefore be applicable to any single-molecule partial peptide sequencing technology.
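The idea of identifying sequences from only a subset of readable amino acids can be made concrete with a toy calculation. This is a sketch under strong assumptions and not the simulation from the note: the four sequences are invented, unread residues are assumed to leave a position-preserving placeholder `X`, and digestion, enzymatic efficiency and read length are ignored. The names `project` and `fraction_unique` are hypothetical.

```python
from collections import Counter

def project(sequence, readable):
    """Keep readable residues; replace every other residue with 'X'."""
    return "".join(aa if aa in readable else "X" for aa in sequence)

def fraction_unique(sequences, readable):
    """Fraction of sequences whose partial read matches no other sequence."""
    reads = [project(seq, readable) for seq in sequences]
    counts = Counter(reads)
    unique = sum(1 for read in reads if counts[read] == 1)
    return unique / len(sequences)

# Invented toy sequences, not real proteins or peptides.
TOY = ["MKCLLK", "MKALLK", "MACLLK", "MKCLLR"]

one_letter = fraction_unique(TOY, {"K"})        # read lysine only
two_letters = fraction_unique(TOY, {"K", "C"})  # read lysine and cysteine
print(one_letter, two_letters)
```

Reading a second amino acid separates sequences that were indistinguishable with one, which is the kind of trade-off the simulations in the note quantify at proteome scale.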

    Rewinding the molecular clock: looking at pioneering molecular phylogenetics experiments in the light of proteomics

    Science is full of overlooked and undervalued research waiting to be rediscovered. Proteomics is no exception. In this perspective, we follow the ripples from a 1960 study of Zuckerkandl, Jones, and Pauling comparing tryptic peptides across animal species. This pioneering work directly led to the molecular clock hypothesis and the ensuing explosion in molecular phylogenetics. In the decades following, proteins continued to provide essential clues on evolutionary history. While technology has continued to improve, contemporary proteomics has strayed from this larger biological context, rarely comparing species or asking how protein structure, function, and interactions have evolved. Here we recombine proteomics with molecular phylogenetics, highlighting the value of framing proteomic results in a larger biological context and how almost forgotten research, though technologically surpassed, can still generate new ideas and illuminate our work from a different perspective. Though it is infeasible to read all research published on a large topic, looking up older papers can be surprisingly rewarding when rediscovering a "gem" at the end of a long citation chain, aided by digital collections and perpetually helpful librarians. Proper literature study reduces unnecessary repetition and allows research to be more insightful and impactful by truly standing on the shoulders of giants.

    “Lossless” compression of high resolution mass spectra of small molecules

    Fourier transform ion cyclotron resonance (FTICR) provides the highest resolving power of any commercially available mass spectrometer. This advantage is most significant for species of low mass-to-charge ratio (m/z), such as metabolites. Unfortunately, FTICR spectra contain a very large number of data points, most of which are noise. This is most pronounced at the low m/z end of spectra, where data point density is highest but peak density is low. We therefore developed a filter that offers lossless compression of FTICR mass spectra from singly charged metabolites. The filter relies on the high resolving power and mass measurement precision of FTICR and removes only those m/z channels that cannot contain signal from singly charged organic species. The resulting pseudospectra still contain the same signal as the original spectra but less uninformative background. The filter does not affect the outcome of standard downstream chemometric analysis methods, such as principal component analysis, but use of the filter significantly reduces memory requirements and CPU time for such analyses. We demonstrate the utility of the filter for urinary metabolite profiling using direct infusion electrospray ionization and a 15 tesla FTICR mass spectrometer.

    In Conversation with Mubin Shaikh: From Salafi Jihadist to Undercover Agent inside the "Toronto 18" Terrorist Group

    This interview with former undercover agent Mubin Shaikh can help academics and security practitioners understand the key role played and the challenges faced by covert human intelligence sources within domestic terrorist groups. The interview highlights the identity crisis, the personal factors, and the allure of jihadi militancy that initially drove Shaikh to join a Salafi jihadist group. It investigates Shaikh’s process of disengagement from the Salafi jihadist belief system and his rediscovery of a moderate, inclusive, and benevolent form of Islam. It explores his work as an undercover agent for the Canadian Security Intelligence Service, the Royal Canadian Mounted Police, and the Integrated National Security Enforcement Team responsible for disrupting domestic terrorist groups. The “Toronto 18” terrorist cell, the key role played by undercover agents in preventing terrorist action, and the challenges posed by entrapment are also discussed.

    Use of expressed sequence tags as an alternative approach for the identification of Taenia solium metacestode excretion/secretion proteins

    BACKGROUND: Taenia solium taeniasis/cysticercosis is a zoonotic helminth infection mainly found in rural regions of Africa, Asia and Latin America. In endemic areas, diagnosis of cysticercosis largely depends on serology, but these methods have their drawbacks and require improvement. Improving them requires better knowledge of the proteins secreted and excreted by the parasite. In a previous study, we used a custom protein database containing protein sequences from related helminths to identify T. solium metacestode excretion/secretion proteins. An alternative or complementary approach would be to use expressed sequence tags combined with BLAST and protein mapping to supercontigs of Echinococcus granulosus, a closely related cestode. In this study, we evaluate this approach and compare the results to those obtained in the previous study. FINDINGS: We report 297 proteins organized in 106 protein groups based on homology. Additional classification was done using Gene Ontology information on biological process and molecular function. Of the 106 protein groups, 58 groups were newly identified, while 48 groups confirmed previous findings. Blast2GO analysis revealed that the majority of the proteins were involved in catalytic activities and binding. CONCLUSIONS: In this study, we used translated expressed sequence tags combined with BLAST and mapping strategies to both confirm and complement previous research. Our findings are comparable to recent studies on other helminth genera like Echinococcus, Schistosoma and Clonorchis, indicating similarities between helminth excretion/secretion proteomes.

    APE in the wild: automated exploration of proteomics workflows in the bio.tools registry

    The bio.tools registry is a main catalogue of computational tools in the life sciences. More than 17 000 tools have been registered by the international bioinformatics community. The bio.tools metadata schema includes semantic annotations of tool functions, that is, formal descriptions of tools' data types, formats, and operations with terms from the EDAM bioinformatics ontology. Such annotations enable the automated composition of tools into multistep pipelines or workflows. In this Technical Note, we revisit a previous case study on the automated composition of proteomics workflows. We use the same four workflow scenarios but instead of using a small set of tools with carefully handcrafted annotations, we explore workflows directly on bio.tools. We use the Automated Pipeline Explorer (APE), a reimplementation and extension of the workflow composition method previously used. Moving "into the wild" opens up an unprecedented wealth of tools and a huge number of alternative workflows. Automated composition tools can be used to explore this space of possibilities systematically. Inevitably, the mixed quality of semantic annotations in bio.tools leads to unintended or erroneous tool combinations. However, our results also show that additional control mechanisms (tool filters, configuration options, and workflow constraints) can effectively guide the exploration toward smaller sets of more meaningful workflows.